An Integrated Approach for RNA-seq Data Normalization
Abstract
BACKGROUND: DNA copy number alteration is common in many cancers. Studies have shown that insertion or deletion of DNA sequences can directly alter gene expression, and that DNA copy number and gene expression are significantly correlated. Data normalization is a critical step in the analysis of gene expression data generated by RNA-seq technology. Successful normalization reduces or removes unwanted nonbiological variation in the data while keeping meaningful information intact. However, as far as we know, no attempt has been made to adjust for the variation due to DNA copy number changes in RNA-seq data normalization.
RESULTS: In this article, we propose an integrated approach for RNA-seq data normalization. Comparisons show that the proposed normalization can improve power for downstream detection of differentially expressed genes and generate more biologically meaningful results in gene profiling. In addition, our findings show that, because of copy number changes, some housekeeping genes are not always suitable internal controls for studying gene expression.
CONCLUSIONS: By using information from DNA copy number, the integrated approach successfully reduces noise due to both biological and nonbiological causes in RNA-seq data, thus increasing the accuracy of gene profiling.
Similar resources
Systematic comparison of RNA-Seq normalization methods using measurement error models
MOTIVATION Further advancement of RNA-Seq technology and its application call for the development of effective normalization methods for RNA-Seq data. Currently, different normalization methods are compared and validated by their correlations with a certain gold standard. Gene expression measurements generated by a different technology or platform such as Real-time reverse transcription polymer...
Pathway analysis for RNA-Seq data using a score-based approach.
A variety of pathway/gene-set approaches have been proposed to provide evidence of higher-level biological phenomena in the association of expression with experimental condition or clinical outcome. Among these approaches, it has been repeatedly shown that resampling methods are far preferable to approaches that implicitly assume independence of genes. However, few approaches have been optimize...
Performance Assessment and Selection of Normalization Procedures for Single-Cell RNA-Seq
Due to the presence of systematic measurement biases, data normalization is an essential preprocessing step in the analysis of single-cell RNA sequencing (scRNA-seq) data. While a variety of normalization procedures are available for bulk RNA-seq, their suitability with respect to single-cell data is still largely unexplored. Furthermore, there may be multiple, competing considerations behind t...
TRAPR: R Package for Statistical Analysis and Visualization of RNA-Seq Data
High-throughput transcriptome sequencing, also known as RNA sequencing (RNA-Seq), is a standard technology for measuring gene expression with unprecedented accuracy. Numerous Bioconductor packages have been developed for the statistical analysis of RNA-Seq data. However, these tools focus on specific aspects of the data analysis pipeline, and are difficult to appropriately integrate with one an...
Assessment of Single Cell RNA-Seq Normalization Methods
We have assessed the performance of seven normalization methods for single cell RNA-seq using data generated from dilution of RNA samples. Our analyses showed that methods considering spike-in External RNA Control Consortium (ERCC) RNA molecules significantly outperformed those not considering ERCCs. This work provides guidance for selecting normalization methods to remove technical noise in s...
Volume 15
Publication date: 2016